12 research outputs found

    Large-scale Machine Learning in High-dimensional Datasets

    Get PDF

    Dimensionality reduction for click-through rate prediction: Dense versus sparse representation

    Get PDF
    In online advertising, display ads are increasingly being placed based on real-time auctions where the advertiser who wins gets to serve the ad. This is called real-time bidding (RTB). In RTB, auctions have very tight time constraints on the order of 100ms. Therefore mechanisms for bidding intelligently such as clickthrough rate prediction need to be sufficiently fast. In this work, we propose to use dimensionality reduction of the user-website interaction graph in order to produce simplified features of users and websites that can be used as predictors of clickthrough rate. We demonstrate that the Infinite Relational Model (IRM) as a dimensionality reduction offers comparable predictive performance to conventional dimensionality reduction schemes, while achieving the most economical usage of features and fastest computations at run-time. For applications such as real-time bidding, where fast database I/O and few computations are key to success, we thus recommend using IRM based features as predictors to exploit the recommender effects from bipartite graphs.Comment: Presented at the Probabilistic Models for Big Data workshop at NIPS 201

    Semi-supervised Eigenvectors for Locally-biased Learning

    No full text
    In many applications, one has side information, e.g., labels that are provided in a semi-supervised manner, about a specific target region of a large data set, and one wants to perform machine learning and data analysis tasks “nearby” that pre-specified target region. Locally-biased problems of this sort are particularly challenging for popular eigenvector-based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities. In this paper, we address this issue by providing a methodology to construct semi-supervised eigenvectors of a graph Laplacian, and we illustrate how these locally-biased eigenvectors can be used to perform locally-biased machine learning. These semi-supervised eigenvectors capture successively-orthogonalized directions of maximum variance, conditioned on being well-correlated with an input seed set of nodes that is assumed to be provided in a semi-supervised manner. We also provide several empirical examples demonstrating how these semi-supervised eigenvectors can be used to perform locally-biased learning.

    A Randomized Heuristic for Kernel Parameter Selection with Large-scale Multi-class Data

    No full text
    Goal: Develop an efficient heuristic for kernel parameter selection with largescale multi-class data. Idea: Measure class dispersion by the radius of the minimum enclosing ball of the class means in the Reproducing Kernel Hilbert Space (RKHS) and choos

    Summary

    No full text
    www.imm.dtu.d

    Personalized Audio Systems - a Bayesian Approach

    No full text
    Modern audio systems are typically equipped with several user-adjustable parameters unfamiliar to most users listening to the system. To obtain the best possible setting, the user is forced into multi-parameter optimization with respect to the users's own objective and preference. To address this, the present paper presents a general inter-active framework for personalization of such audio systems. The framework builds on Bayesian Gaussian process regression in which a model of the users's objective function is updated sequentially. The parameter setting to be evaluated in a given trial is selected by model-based sequential experimental design. A Gaussian process model is proposed which incorporates correlation among particular parameters providing better modeling capabilities compared to a standard model. A ve-band equalizer is considered for demonstration purposes, in which the parameters are optimized using the proposed framework. Twelve test subjects obtain a personalized setting with the framework, and these settings are signicantly preferred to those obtained with random experimentation
    corecore